Abstract. We will discuss the maximum entropy production (MaxEP) principle 
based on Jaynes' information theoretical arguments, as was done by Dewar (2003, 
2005). With the help of a simple mathematical model of a non-equilibrium system, we 
will show how to derive minimum and maximum entropy production. Furthermore, 
the model will help us to clarify some confusing points and to see differences between 
some MaxEP studies in the literature. 
1. Motivation 

The maximum entropy production (MaxEP) is believed to be an organizational principle 
applicable to physical and biological systems (see reviews in [9], [13], El). There have been
different attempts to prove MaxEP theoretically. The most detailed mathematical
studies were done in two papers by Dewar [1, 2]. Dewar proposed different derivations
of MaxEP by using the maximum information entropy (MaxEnt) procedure by Jaynes
[6, 7]. The argument is similar to the derivation of the Gibbs ensemble in equilibrium
statistical mechanics, but with the crucial difference that the information entropy is
defined by a probability measure on path space rather than on phase space.

In this article, we will comment on the arguments by Dewar. We will do this with a 
rather simple model which one can easily solve. The most important conclusion is that 
Dewar discussed basically three different derivations, leading to at least three comments: 
- The derivation in [1] leads in the linear response regime to the well known minimum
entropy production (MinEP, [10, 11]) instead of MaxEP.

- The derivation in the main text of [2] works only in the linear response regime, and
leads to Ziegler's MaxEP principle [19] or a 'linear response' MaxEP principle as used
in e.g. [20].

- The derivation in the appendix of [2] contains some unresolved remarks that need further
clarification. We will see that this derivation is related to what we will call the
'total steady state' MaxEP.

Furthermore, Dewar refers to the work by e.g. Paltridge [16] on climate systems. We
will demonstrate that Paltridge's MaxEP principle, which we will call 'partial steady
state' MaxEP, is different from the above mentioned MaxEP principles. Hence,
'partial steady state' MaxEP is unrelated to the principles derived in [1, 2]. This
leads us to the important conclusion that there are different MaxEP principles
discussed in the literature, often leading to some confusion. In fact, some studies
[15, 18], in particular in fluid dynamics, discussed yet another principle, which we
will call the 'non-variational MaxEP' principle. In the appendix we point at a rough
analogy between Paltridge's 'partial steady state' MaxEP principle and an equilibrium 
system with MaxEnt. There are some theoretical problems associated with this analogy, 
but nevertheless we present it to clarify the line of reasoning of MaxEP and its possible 
information theoretical derivation. 

2. The model 

Let us consider a system of $b$ sites, with a real variable $n_i(t)$ ($i = 1, \ldots, b$) at each site.
These variables depend on the discrete time $t = 0, 1, \ldots, \tau$. At every timestep there is
a random flux between the sites. The flux $f_{ij} = -f_{ji}$ from $i$ to $j$ depends on a real
constant parameter $c_{ij} = c_{ji}$, such that $f_{ij}(t) = \pm c_{ij}$ where the sign is stochastic. A
microscopic path $\Gamma$ is a specific set of values $+c_{ij}$ or $-c_{ij}$ for every timestep and every
$i$ and $j$. The path space is the set of all possible paths. The sign stochasticity gives a
stochastic dynamics, such that for each microscopic path we have

$$n_{i,\Gamma}(t+1) - n_{i,\Gamma}(t) = -\sum_j f_{ij,\Gamma}(t). \tag{1}$$

The time averages depend on the path $\Gamma$ and are denoted with an overline, e.g.
$\bar f_{ij,\Gamma} = \frac{1}{\tau}\sum_t f_{ij,\Gamma}(t)$. To each microscopic path, we assign a probability $p_\Gamma$. The path
ensemble averages are written with brackets, e.g. $\langle \bar f_{ij} \rangle = \sum_\Gamma p_\Gamma \bar f_{ij,\Gamma}$.

To find the most likely probability measure on path space, we will use Jaynes' 
information theory formalism by maximizing the path information entropy 

$$S_I = -\sum_\Gamma p_\Gamma \ln p_\Gamma \tag{2}$$

under the constraints

$$\sum_\Gamma p_\Gamma = 1, \tag{3}$$

$$\sum_\Gamma p_\Gamma\, n_{i,\Gamma}(0) = \langle n_i(0) \rangle, \tag{4}$$

$$\sum_\Gamma p_\Gamma\, \bar f_{ij,\Gamma} = F_{ij}, \tag{5}$$

for some (or all) $i$ and $j$. These constraints were used by Dewar [1] and are to be
interpreted as follows:

The first constraint is the normalization of the probability measure.
The second constraint means that at the initial time, the (path ensemble average) value
of $n_i$ is measured. $n_{i,\Gamma}(0)$ does not depend on the complete path, but only on the initial
time value of the path.

The third constraint means that the time and path ensemble average of the flux from $i$
to $j$ is measured to be the numerical value $F_{ij}$.

The maximum of $S_I$ under the constraints results in

$$p_\Gamma = \frac{1}{Z} \exp A_\Gamma \tag{6}$$

with the path action

$$A_\Gamma = \sum_i \lambda_i n_{i,\Gamma}(0) + \sum_{ij} \eta_{ij} \bar f_{ij,\Gamma}, \tag{7}$$

with $\lambda_i$ and $\eta_{ij} = -\eta_{ji}$ the Lagrange multipliers of constraints (4) and (5) respectively. By
deriving

$$\langle n_i(0) \rangle = \frac{\partial \ln Z}{\partial \lambda_i}, \tag{8}$$

and using (4), we get $\langle n_i(0) \rangle = n_i(0)$. This basically means that constraint (4) is
trivially satisfied due to (3), so we can take $\lambda_i = 0$. The reason behind this is that $n_i(0)$
does not depend on the complete path.

The partition sum and the Lagrange multipliers rjij can be easily calculated: 

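$$Z = \prod_{ij} Z_{ij} = \prod_{ij} \left(2\cosh\frac{\eta_{ij} c_{ij}}{\tau}\right)^{\tau}, \tag{9}$$

$$F_{ij} = \frac{\partial \ln Z}{\partial \eta_{ij}} = c_{ij} \tanh\frac{\eta_{ij} c_{ij}}{\tau}, \tag{10}$$

$$X_{ij} \equiv \frac{\eta_{ij}}{\tau} = \frac{1}{c_{ij}} \operatorname{arcth}\frac{F_{ij}}{c_{ij}}. \tag{11}$$

One can split the time and ensemble averaged fluxes $F$ into forward and backward
components:

$$F_{ij} = F_{ij}^{+} - F_{ij}^{-}, \tag{12}$$

such that $2 c_{ij} X_{ij} = \ln(F_{ij}^{+}/F_{ij}^{-})$ (which is a well known expression for the thermodynamic
forces for e.g. elementary chemical reactions [11]). Equations (11) are the constitutive
(phenomenological) equations of motion.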
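
As a numerical sanity check of (9)-(11), the partition sum of a single link can be enumerated by brute force. The sketch below is only illustrative: the helper name `brute_force_link` and the parameter values are assumptions, and the split $F^{\pm} = (c \pm F)/2$ (that is, $F^{+} + F^{-} = c$) is an assumption that is consistent with (11) and (12) but is not spelled out in the text.

```python
import itertools
import math

def brute_force_link(c, eta, tau):
    """Enumerate all 2**tau sign sequences of one link (hypothetical helper).

    Returns the partition sum Z and the ensemble-averaged flux F.
    """
    Z = 0.0
    weighted_flux = 0.0
    for signs in itertools.product((+1, -1), repeat=tau):
        f_bar = c * sum(signs) / tau     # time-averaged flux of this path
        weight = math.exp(eta * f_bar)   # unnormalised path weight exp(A_Gamma)
        Z += weight
        weighted_flux += weight * f_bar
    return Z, weighted_flux / Z

# Illustrative parameter values (assumptions, not taken from the text).
c, tau, X = 1.5, 8, 0.4
eta = tau * X                            # multiplier, from X = eta / tau

Z, F = brute_force_link(c, eta, tau)
Z_closed = (2.0 * math.cosh(eta * c / tau)) ** tau   # eq. (9) for one link
F_closed = c * math.tanh(eta * c / tau)              # eq. (10)
X_back = math.atanh(F / c) / c                       # eq. (11)

# Assumed forward/backward split with F_plus + F_minus = c.
F_plus, F_minus = (c + F) / 2.0, (c - F) / 2.0
log_ratio = math.log(F_plus / F_minus)               # compare with 2 c X

print(Z, Z_closed)
print(F, F_closed)
print(X_back, X, log_ratio, 2.0 * c * X)
```

The enumeration grows as $2^{\tau}$, so it is only meant for small $\tau$; the closed forms are what one would use in practice.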

Notice that our description of a stochastic non-equilibrium dynamical model is
mathematically equivalent to a statistical equilibrium ferromagnetic spin model. This
can be seen by interpreting the fluxes $f_{ij}(t)$ as the values of the ferromagnetic spins $s_{ij}(t)$.
These spins take values $\pm c_{ij}$ for every $t$. Instead of time, $t$ is interpreted as a spatial
coordinate, so for every $i$ and $j$ we have a one-dimensional spin chain. The observed
averaged fluxes $F_{ij}$ correspond to the observed mean magnetisations $m_{ij}$ for every spin
chain. In the equilibrium spin model, the multipliers $\eta$ are basically inverse temperatures
of the chains. In the non-equilibrium interpretation, these multipliers are related to
the thermodynamic driving forces $X$ which are conjugate to the fluxes $F$.

As in [1], the entropy production (EP) of a microscopic path $\Gamma$ will be defined as
the time antisymmetric (irreversible) part of the action, written as $\sigma_\Gamma = A_\Gamma/\tau$. In our
example, the fluxes are all time antisymmetric and there is no symmetric (reversible)
part of the action, so we have $\sigma_\Gamma = \sum_{ij} X_{ij} \bar f_{ij,\Gamma}$. The expectation value of the EP is (for
convenience written without brackets)

$$\sigma = \sum_\Gamma p_\Gamma \sigma_\Gamma = \sum_{ij} X_{ij} \sum_\Gamma p_\Gamma \bar f_{ij,\Gamma} = \sum_{ij} X_{ij} F_{ij}, \tag{14}$$

which is the classical expression for the EP as a bilinear form of forces and fluxes. 

Plugging the solution (6) into (2), we get the maximum information entropy as a
function of the forces:

$$S_{I,\max}(X) = \ln Z(X) - \langle A(X) \rangle \tag{15}$$

$$= \tau \sum_{ij} \ln \frac{2\cosh(X_{ij}c_{ij})}{\exp\left(X_{ij}c_{ij}\tanh(X_{ij}c_{ij})\right)} \tag{16}$$

$$\equiv \ln W(\langle A(X) \rangle) \tag{17}$$

with $W(\langle A(X) \rangle)$ the 'density of paths': the number of paths with approximately the
average $\langle A(X) \rangle$ as path action.
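
The closed form for the maximal path entropy can be compared with a direct evaluation of $-\sum_\Gamma p_\Gamma \ln p_\Gamma$. The following is a minimal sketch for a single link, under the assumption that $\lambda_i = 0$ and $\eta = \tau X$ as above; the function names and numerical values are hypothetical.

```python
import math

def path_entropy_exact(c, X, tau):
    """Shannon entropy -sum_Gamma p_Gamma ln p_Gamma for one link.

    Paths are grouped by k, the number of timesteps with flux +c; there are
    comb(tau, k) such paths, all with the same probability.
    """
    x = X * c
    log_Z = tau * math.log(2.0 * math.cosh(x))
    entropy = 0.0
    for k in range(tau + 1):
        log_p = x * (2 * k - tau) - log_Z      # ln p_Gamma = A_Gamma - ln Z
        entropy -= math.comb(tau, k) * math.exp(log_p) * log_p
    return entropy

def path_entropy_closed(c, X, tau):
    """Closed form tau * [ln(2 cosh(Xc)) - Xc tanh(Xc)] for a single link."""
    x = X * c
    return tau * (math.log(2.0 * math.cosh(x)) - x * math.tanh(x))

# Illustrative values (assumptions).
c, tau = 1.5, 20
S_exact = path_entropy_exact(c, 0.4, tau)
S_closed = path_entropy_closed(c, 0.4, tau)
S_zero_force = path_entropy_exact(c, 0.0, tau)   # uniform measure over all paths

print(S_exact, S_closed, S_zero_force, tau * math.log(2.0))
```

At zero force all $2^{\tau}$ paths are equally likely, so the entropy reduces to $\tau \ln 2$; any non-zero force lowers it.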

Next we introduce the entropy curvature (or response) matrix as in [2]

$$A_{ij,kl}(F) \equiv -\frac{1}{\tau}\frac{\partial^2 S_{I,\max}(F)}{\partial F_{ij}\,\partial F_{kl}} = \frac{\partial X_{ij}(F)}{\partial F_{kl}} \tag{18}$$

$$= \frac{\delta_{ij,kl}}{c_{kl}^2 - F_{kl}^2} \tag{19}$$

with $\delta_{ij,kl}$ the Kronecker delta matrix ($\delta_{ij,kl} = 1$ if $i = k$ and $j = l$, and zero otherwise).

With the steepest descent approximation (i.e. a quadratic expansion around the average
$F$), as in [2], we can calculate the probability distribution for the time averaged flux

$$p(f) \propto \exp\left(-\frac{\tau}{2}\sum_{ij,kl} [f_{ij} - F_{ij}]\, A_{ij,kl}(F)\, [f_{kl} - F_{kl}]\right). \tag{21}$$

Combining this expression with the fluctuation theorem [4, 5]

$$\frac{p(f)}{p(-f)} = \exp(2\tau\sigma(f)) = \exp\Big(2\tau \sum_{ij} X_{ij} f_{ij}\Big), \tag{22}$$

and taking together the terms linear in $f$ in the exponent, Dewar [2] derived another
expression for the constitutive equation (compare with (11)):

$$X_{ij} = \sum_{kl} A_{ij,kl}(F) F_{kl}. \tag{23}$$

Below, we will point at some hidden assumption in this derivation, clarifying the
difference between (11) and (23).
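
For a single link these statements can be checked against the exact distribution of the time averaged flux. The sketch below assumes the single-link response coefficient $A = 1/(c^2 - F^2)$, obtained by differentiating (11) with respect to $F$; the function name and numbers are hypothetical. It illustrates that the fluctuation relation (22) holds exactly in this model, while the Gaussian form only captures the variance $1/(\tau A)$.

```python
import math

def flux_distribution(c, X, tau):
    """Exact distribution of the time-averaged flux f for one link.

    Returns a dict mapping k (number of +c timesteps) to (f, p(f)).
    """
    x = X * c
    Z = (2.0 * math.cosh(x)) ** tau
    dist = {}
    for k in range(tau + 1):
        f = c * (2 * k - tau) / tau
        dist[k] = (f, math.comb(tau, k) * math.exp(tau * X * f) / Z)
    return dist

# Illustrative values (assumptions).
c, X, tau = 1.5, 0.4, 12
dist = flux_distribution(c, X, tau)
F = c * math.tanh(X * c)                         # eq. (10)

# Fluctuation relation (22): ln[p(f)/p(-f)] = 2 tau X f; -f corresponds to tau - k.
ft_errors = []
for k in range(tau + 1):
    f, p_f = dist[k]
    _, p_minus_f = dist[tau - k]
    ft_errors.append(abs(math.log(p_f / p_minus_f) - 2.0 * tau * X * f))

# Assumed single-link response coefficient and a finite-difference check of dX/dF.
A = 1.0 / (c**2 - F**2)
h = 1e-6
force = lambda flux: math.atanh(flux / c) / c    # eq. (11)
A_fd = (force(F + h) - force(F - h)) / (2.0 * h)

# Mean and variance of f; the quadratic expansion predicts variance 1 / (tau * A).
mean = sum(f * p for f, p in dist.values())
var = sum((f - mean) ** 2 * p for f, p in dist.values())

print(max(ft_errors), A, A_fd, mean, F, var, 1.0 / (tau * A))
```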

As a final definition, we introduce the dissipation function as in [2]

$$D(F) = 2 \sum_{ij,kl} A_{ij,kl}(F) F_{ij} F_{kl}. \tag{24}$$

In the linear response regime near thermodynamic equilibrium, all forces $X$ are
small and by (11) they are (approximately) linearly related to the fluxes as

$$X_{ij,\mathrm{lin}} \approx F_{ij}/c_{ij}^2. \tag{25}$$

In this regime, the two constitutive equations (11) and (23) become equal to (25), and
the dissipation function (approximately) equals the EP

$$D_{\mathrm{lin}}(F) \approx \sigma(X(F), F). \tag{26}$$

This is the basic set-up, as discussed in [1, 2]. Now we will give some comments on
Dewar's arguments.



3. Comments 



3.1. Linear response minimum entropy production 

The first article [1] focused on the non-equilibrium steady state. Up till now, the forces
$X_{ij}$ (and the parameter values $c_{ij}$) were supposed to be constants. However, in most
systems, they can change. Let us introduce a new, longer timescale $T = t/\tau$. The
forces, fluxes and parameters are approximately constant for short timescales $0 < t < \tau$,
but they can slowly change. Suppose the system $X(T)$, $F(T)$ attains a steady state for
$T \to \infty$. What happens with the EP $\sigma(T)$?
As can be seen by the counting argument (17), we can calculate $W$ in the linear
response regime for small $X$:

$$W(\langle A(X) \rangle) \approx \prod_{ij} \left(2 - (X_{ij}c_{ij})^2\right)^{\tau}. \tag{27}$$

With this simplification, the path information entropy becomes

$$S_{I,\max}(X) = \ln W \approx \ln 2^{\tau b(b-1)/2} + \tau \sum_{ij} \ln\left(1 - \frac{(X_{ij}c_{ij})^2}{2}\right) \tag{28}$$

$$\approx \ln 2^{\tau b(b-1)/2} - \frac{\tau\sigma}{2}. \tag{29}$$


The interpretation of this result is clear: the first term on the right hand side is the
logarithm of the total number of paths (for a uniform probability distribution). The
second term only contains the EP. In [1], an important assumption was made in order to
derive MaxEP: the number of paths $W$ should be an increasing function of the averaged
action. This averaged action is proportional to the EP, and hence one could claim that
the higher the EP, the higher $S_{I,\max}$. However, here we obtain the reverse, resulting
in a minimization of the entropy production (MinEP). Suppose there are additional
constraints such as



(30) 



with $X_e^0$ constant 'external' forces. An example of this kind of constraint on the
forces is the Kirchhoff loop law in electrical networks. Minimizing the EP $\sigma$ under
these constraints, and using the linearized constitutive equations (25), one can find
(for $T \to \infty$) the unique steady state (written with *), which for site $i$ is given by

$$\sum_j F_{ij}^{*} = 0 \tag{31}$$

(as follows from (1)). We conclude that the derivation in [1] can be used to derive
MinEP rather than MaxEP, because the assumption that $W$ is a decreasing function of
$\sigma$ is valid in our model.
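
The MinEP statement can be illustrated on a three-site chain. The sketch below is a minimal example under stated assumptions: it uses the series constraint $X_{12} + X_{23} = X^0$ of (34) as a concrete instance of the force constraint, the linearized relation (25), and hypothetical parameter values and function names. It also compares the single-link path entropy per timestep with the small-force form $\ln 2 - \sigma/2$.

```python
import math

def total_ep_linear(X12, X0, c12, c23):
    """Total EP sum X_ij F_ij with F_ij = c_ij**2 X_ij (eq. 25) and X23 = X0 - X12."""
    X23 = X0 - X12
    return c12**2 * X12**2 + c23**2 * X23**2

def argmin_ternary(func, lo, hi, iterations=200):
    """Ternary search for the minimiser of a convex function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Illustrative values (assumptions).
X0, c12, c23 = 0.05, 1.3, 0.7

# Minimise the total EP over the one free force X12.
X12_min = argmin_ternary(lambda x: total_ep_linear(x, X0, c12, c23), 0.0, X0)
F12 = c12**2 * X12_min
F23 = c23**2 * (X0 - X12_min)
print(F12, F23)          # the two fluxes coincide: steady state of site 2

# Small-force check of the path entropy per timestep for one link.
x = X0 * c12
S_exact_per_step = math.log(2.0 * math.cosh(x)) - x * math.tanh(x)
sigma_lin = c12**2 * X0**2                     # X F with F = c**2 X
S_approx_per_step = math.log(2.0) - 0.5 * sigma_lin
print(S_exact_per_step, S_approx_per_step)
```

Minimizing the EP reproduces the flux balance $F_{12} = F_{23}$ without imposing it, which is the content of the MinEP principle in this example.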



3.2. Ziegler's and linear response MaxEP 

Let us now comment on [2]. Ziegler [19] has proposed a MaxEP principle to derive the
constitutive equations. It only works for systems in the linear response regime (and
some highly restricted exceptional cases mentioned in [2], but we will not discuss them
here). It is a variational principle, with a Lagrangian which is to be maximized:

$$\mathcal{L}_{\mathrm{Ziegler}}(F) = D(F) + \gamma\Big(D(F) - 2\sum_{ij} X_{ij}^{0} F_{ij}\Big). \tag{32}$$

The last term is a constraint with Lagrange multiplier $\gamma$. In this variation, $X^0$ is kept
fixed. It is this 'maximum dissipation' principle that was explained in [2], eq. (22)‡.



‡ In [2], a 'dual' version is applied, switching the roles of the forces and the fluxes. It is mathematically
equivalent to our formulation.
It is important to keep in mind that (contrary to what is claimed in [2]) the
constitutive equations derived from the above Lagrangian are only compatible with
(18) and (23) when $F_{ij}\,\partial A_{ij,kl}/\partial F_{mn} = 0$. It is clear that this restriction does not hold in
the non-linear regime of our model with constitutive equations (11). Only in the linear
response regime (when $A$ is a constant matrix, leading to (25)) is (23) compatible with
(11). The reason why the derivation in [2] only works near equilibrium (i.e. in the
linear response regime) is the use of a steepest descent approximation in (21).
This works only when the fluxes $f_{ij}$ are close to their expectations $F_{ij}$. But using the
fluctuation theorem (22), also $-f_{ij}$ should be close to $F_{ij}$. This is only possible when
$F_{ij}$ is small.

The derivation in [2] also has another application, as one can add more constraints
to the above Ziegler's principle in order not to find the constitutive equations, but to
find the unique near-equilibrium steady state. This is also a MaxEP principle, which
we will name 'linear response MaxEP' because it only works in the near-equilibrium
linear response regime§. Zupanovic et al. [20] discussed this principle with an electrical
network as an example, whereby the forces are the voltages. The Lagrangian generally
looks like

$$\mathcal{L}_{\mathrm{linear}} = D(F) + \gamma_0\Big(D(F) - \sum_e X_e^0 F_e\Big) + \sum_e \gamma_e\Big(\sum_{ij}\beta_{ij,e}F_{ij} - F_e\Big). \tag{33}$$

The second term on the right hand side is the constraint which says that in the steady
state the power influx into the system due to the fixed external driving forces $X_e^0$
(with conjugate external fluxes $F_e$ that do not contribute to the dissipation $D(F)$)
is completely dissipated. In [20], this fixed external driving force is the applied voltage
of a battery, and $F_e$ is the current through this battery. The last term (with constants
$\beta_{ij,e}$) is a steady state constraint on the fluxes. In the electrical network example in [20]
it is Kirchhoff's current law.

Note that, as in the previous comment, we can use a counting argument to derive
Ziegler's or linear response MaxEP. In the near equilibrium regime we have (26) and
$W(\sigma) = W(D)$ becomes maximal under the constraints in (32) or (33).
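
A small numerical illustration of the linear response MaxEP can be given for a hypothetical network of two resistors in parallel across one battery. Everything specific here is an assumption made for the sketch: the conductances `C1`, `C2`, the voltage `X0`, the linear-regime dissipation $D(F) = F_1^2/C_1 + F_2^2/C_2$ (which equals the EP by (26)), and Kirchhoff's current law in the form $F_e = F_1 + F_2$. The power balance $D(F) = X^0 (F_1 + F_2)$ then defines an ellipse in the $(F_1, F_2)$ plane, and the dissipation is maximized along it.

```python
import math

def ellipse_point(theta, X0, C1, C2):
    """Point (F1, F2) on the constraint curve D(F) = X0 * (F1 + F2).

    With D(F) = F1**2 / C1 + F2**2 / C2 this curve is an ellipse centred at
    (C1 * X0 / 2, C2 * X0 / 2); theta parametrises it.
    """
    R = 0.5 * X0 * math.sqrt(C1 + C2)
    F1 = 0.5 * C1 * X0 + math.sqrt(C1) * R * math.cos(theta)
    F2 = 0.5 * C2 * X0 + math.sqrt(C2) * R * math.sin(theta)
    return F1, F2

def dissipation(F1, F2, C1, C2):
    """Linear-regime dissipation of the two branches."""
    return F1**2 / C1 + F2**2 / C2

# Hypothetical network parameters.
X0, C1, C2 = 2.0, 1.0, 3.0

# Scan the constraint curve and keep the point of maximal dissipation.
n = 200_000
best = max(
    (ellipse_point(2.0 * math.pi * k / n, X0, C1, C2) for k in range(n)),
    key=lambda F: dissipation(F[0], F[1], C1, C2),
)
print(best, (C1 * X0, C2 * X0))
```

In this example the maximum lies at $F_i = C_i X^0$, i.e. the currents that Ohm's law would give for each branch.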

3.3. Partial steady state MaxEP 

In his two papers, Dewar also refers to the work by Paltridge that gives experimental
validation of the MaxEP principle. The basic idea of the climate model of Paltridge is
similar to the idea in e.g. [3] or [8] for chemical reactions. Paltridge divides the universe
into compartments (sun, equator, pole and deep space) with energy fluxes between them,
just as the chemical reaction system of ATP synthase in [3] consists of compartments (the
different molecular states) with particle fluxes between them. In the Paltridge model,
there is atmospheric heat transport from equator to pole, and its transport coefficient is

§ As was correctly noted in [2], this principle should not be confused with the linear response
minimum entropy production principle [11], which uses other constraints resulting in a minimum of the
EP at the near-equilibrium steady state.



A discussion on maximum entropy production and information theory 






a priori not known. This coefficient is guessed by maximizing the EP associated with the 
atmospheric heat transport processes. The other processes and parameters related with 
the heat radiation (e.g. from sun to equator) are a priori known, and the earth system 
is supposed to be in the steady state. Note that not the total EP is maximized. In [3], 
a parameter $k$ and the flux $F(k)$ between the compartments 0:ATP and 0:P.ADP are
unknown. The most likely values for this parameter and flux are derived by maximizing 
the corresponding EP (not the total EP of all reactions), knowing that the system is in 
the steady state. 

Making the analogy with our model, we can take a system consisting of three
compartments (sites with $b = 3$), with parameters $c_{13} = 0$ and $c_{12} \neq 0$ a priori known.
Like the atmospheric heat transport coefficient or the $k$ parameter, $c_{23}$ is unknown, and it
is guessed by maximizing the corresponding partial EP $\sigma_{23} = X_{23}F_{23}$ under the steady
state conditions. This explains the name 'partial steady state' MaxEP. The steady state
conditions are e.g.

$$X_{12} + X_{23} = X^0 \tag{34}$$

(as a specific example of (30)) with $X^0$ known and fixed, and

$$F_{12}^{*} = F_{23}^{*}. \tag{35}$$

(This is the steady state condition for the middle site 2, as a specific example of (31).
The total system, including sites 1 and 3, is not in the steady state, except when the
total system is in equilibrium.)

Under these constraints, the partial EP can be written as

$$\sigma_{23}^{*} = F_{23}^{*}\left(X^0 - \frac{1}{c_{12}}\operatorname{arcth}\frac{F_{23}^{*}}{c_{12}}\right). \tag{36}$$

The maximum gives a complicated expression of $F_{23,\max}^{*}(X^0, c_{12})$ as a function of the
known parameters. This also gives $c_{23,\max}(X^0, c_{12})$. Although it is believed [8], [16], [T^
that this principle is applicable to the far-from-equilibrium regime, we can also look
at the linear response regime, where it is easy to calculate that $c_{23,\max} = c_{12}$ and

$$F_{23,\max}^{*} = \frac{c_{12}^2 X^0}{2}.$$
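
The linear response result can be reproduced numerically. The sketch below assumes the three-site chain with the steady state conditions (34) and (35); the function names, the grid and the parameter values are hypothetical. It first scans $c_{23}$ in the linear regime, and then maximizes the non-linear partial EP (36) directly over the steady state flux for a small external force.

```python
import math

def steady_state_linear(c23, X0, c12):
    """Linear-regime steady state of the chain: returns (F, X23)."""
    F = X0 / (1.0 / c12**2 + 1.0 / c23**2)    # from (25), (34) and (35)
    return F, F / c23**2

def partial_ep_linear(c23, X0, c12):
    """Partial EP sigma_23 = X23 * F23 in the linear regime."""
    F, X23 = steady_state_linear(c23, X0, c12)
    return X23 * F

def partial_ep_nonlinear(F, X0, c12):
    """Partial EP (36) as a function of the steady-state flux F."""
    return F * (X0 - math.atanh(F / c12) / c12)

# Illustrative values (assumptions).
X0, c12 = 0.01, 1.3

# Linear regime: scan the unknown parameter c23 on a grid.
grid = [k / 1000.0 for k in range(1, 5001)]
c23_max = max(grid, key=lambda c23: partial_ep_linear(c23, X0, c12))
F_max, _ = steady_state_linear(c23_max, X0, c12)

# Non-linear expression: ternary search for the maximum of a concave function.
lo, hi = 0.0, c12 * math.tanh(X0 * c12)       # range of F where X23 >= 0
for _ in range(200):
    m1 = lo + (hi - lo) / 3.0
    m2 = hi - (hi - lo) / 3.0
    if partial_ep_nonlinear(m1, X0, c12) < partial_ep_nonlinear(m2, X0, c12):
        lo = m1
    else:
        hi = m2
F_nonlinear = 0.5 * (lo + hi)

print(c23_max, c12)
print(F_max, F_nonlinear, 0.5 * c12**2 * X0)
```

For this small $X^0$ the non-linear maximum is very close to the linear response value; for larger forces the two would be expected to differ.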

As mentioned above, [1] results in minimum EP and [2] results in Ziegler's or linear
response MaxEP, and these principles are different in nature from Paltridge's MaxEP principle
discussed here. In the appendix we give an analogy of our model with an equilibrium
model. Although theoretically not very rigorous, the discussed analogy might serve as
a general guideline to clarify the partial steady state MaxEP. For the moment, it is
important to stress that this MaxEP principle remains an unproven hypothesis with a
lot of controversy and unsolved questions about the necessary conditions, requirements
and ranges of application.



3.4. Total steady state MaxEP



In the appendix of [2], Dewar gives a third information theoretical derivation for MaxEP.
An important assumption is made: the total dissipation or (more generally) the total
entropy production should have an upper bound $\sigma(X(F), F) \le \sigma_{\max}$ under some prior
information $C$ (such as the knowledge that the system is in the steady state (34, 35)).
Looking at the example in section 3.3 in the steady state in the linear response regime,
the EP becomes

$$\sigma^{*} = X^0 F_{12}^{*} = \frac{c_{12}^2 c_{23}^2}{c_{12}^2 + c_{23}^2}\,(X^0)^2 \le c_{12}^2 (X^0)^2. \tag{37}$$

The maximum is attained for $c_{23} \to \infty$. This is not compatible with $c_{23,\max} = c_{12}$
obtained by maximizing the entropy production of the unknown flux in section 3.3. As
we varied the total EP in the steady state with respect to an unknown parameter, this
explains the chosen name 'total steady state MaxEP'.
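
The contrast with the partial steady state MaxEP can be made concrete by tabulating the total steady state EP as a function of the unknown parameter. This is a minimal sketch of the linear-regime expression (37); the function name and parameter values are hypothetical.

```python
def total_ep_steady(c23, X0, c12):
    """Total steady-state EP of the three-site chain in the linear regime, eq. (37)."""
    return (c12**2 * c23**2 / (c12**2 + c23**2)) * X0**2

# Illustrative values (assumptions).
X0, c12 = 0.01, 1.3
bound = c12**2 * X0**2                        # upper bound in (37)

c23_values = [0.1 * 2**k for k in range(12)]  # 0.1, 0.2, ..., 204.8
ep_values = [total_ep_steady(c23, X0, c12) for c23 in c23_values]

for c23, ep in zip(c23_values, ep_values):
    print(f"c23 = {c23:8.1f}   sigma* / bound = {ep / bound:.6f}")
```

The total EP increases monotonically with $c_{23}$ and only approaches the bound, so no finite value of $c_{23}$ is selected, unlike in the partial case.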

One can raise questions about the choice of constraints used in Dewar's appendix
derivation. Why not add the inequality constraint $\sigma \ge 0$ as a consequence of (22), or the
steady state constraints (34, 35)? And is the obtained probability measure a maximum
of the information entropy? We will not deal with these questions here, as they should
be taken up in future work.

3.5. Non-variational MaxEP 

At the end of his paper, Dewar [2] mentions the Rayleigh-Bénard convective fluid system.
Others (e.g. [15, 18]) have made a MaxEP hypothesis for other fluid systems. We will
call this principle the 'non-variational' MaxEP, because contrary to the above mentioned
principles, it is a selection principle rather than a variational principle varying the EP
with respect to some continuous variable.

Suppose a system has a highly non-linear dynamics, resulting in the possibility
of having a discrete set of steady states. The hypothesis claims that the selected state
(e.g. the most stable) is the one with the highest EP of all the steady states. E.g. in the
Rayleigh-Bénard system, the steady states are a heat conduction state, a heat convection
state and perhaps other (turbulent) states. For temperature gradient values beyond a
critical transition point, the heat convection state is most stable, and it has the highest
heat transport and the highest EP (see also [17]).

Making the analogy with our model, we will take a time dependence $c_{ij}(T)$ as in
section 3.1. This might give a non-linear dynamics, resulting in different steady states
for the fluxes $F_{ij}^{*}$. The hypothesis will be proven when the most (asymptotically) stable
state has the highest EP. Up till now, no proof of this hypothesis is known, and it is
doubtful whether it is generally true.

3.6. Microscopic MaxEP 

As a smaller final comment, our model demonstrates another kind of MaxEP principle,
different from the above principles. One might look for the microscopic path which
has the highest probability $p_\Gamma$. In our model, we can easily see that this path should
have the maximum value of the action $A_{\Gamma,\max} = \tau \sum_{ij} c_{ij}|X_{ij}|$. This corresponds to a
maximum of the microscopic path EP $\sigma_\Gamma$. This microscopic path EP does not necessarily
result in a maximum of the path-ensemble averaged EP $\sigma$. Furthermore, there are some
doubts that this 'microscopic MaxEP' is generally true (Maes, private communication).
We have seen that the action $A_\Gamma$ in our model is basically the time-antisymmetric part,
which is the EP [12]. But in more general descriptions for other systems, there is also
a time-symmetric part of the action [1^ . When this part also depends on the path, the 
microscopic MaxEP might be invalid.
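
The most probable microscopic path can be found by brute force for a small system. The sketch below is illustrative only: the two-link system, the parameter values and the function name are assumptions, and the action is taken as $A_\Gamma = \tau \sum_{ij} X_{ij} \bar f_{ij,\Gamma}$, i.e. with $\lambda_i = 0$.

```python
import itertools
import math

def most_probable_path(c, X, tau):
    """Brute-force search for the path with the largest action A_Gamma.

    c and X are dicts keyed by link (i, j); a path assigns a sign to every
    link at every timestep. Returns (largest action, tuple of signs).
    """
    links = sorted(c)
    best_action, best_path = -math.inf, None
    for signs in itertools.product((+1, -1), repeat=len(links) * tau):
        action = 0.0
        for n, link in enumerate(links):
            link_signs = signs[n * tau:(n + 1) * tau]
            f_bar = c[link] * sum(link_signs) / tau   # time-averaged flux
            action += tau * X[link] * f_bar
        if action > best_action:
            best_action, best_path = action, signs
    return best_action, best_path

# Hypothetical two-link system with forces of opposite sign.
c = {(1, 2): 1.3, (2, 3): 0.7}
X = {(1, 2): 0.4, (2, 3): -0.9}
tau = 4

A_max, path = most_probable_path(c, X, tau)
A_formula = tau * sum(c[link] * abs(X[link]) for link in c)
print(A_max, A_formula, path)
```

The maximizing path has every flux aligned with the sign of its force at every timestep, which is why the maximal action reduces to $\tau \sum_{ij} c_{ij}|X_{ij}|$.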

Appendix: A MaxEP-MaxEnt analogy 

In section 3.3 we have described the partial steady state MaxEP principle with a simple
example. The intuition of Dewar and others is that this MaxEP principle can be derived 
by maximizing the path information entropy in non-equilibrium statistical mechanics, 
in the same way that Jaynes [6, 7] derived the Gibbs probability measure in equilibrium
statistical mechanics, by maximizing the phase space information entropy. This method 
is called MaxEnt. 

Here we will discuss an analogy of this non-equilibrium MaxEP system with an 
equilibrium MaxEnt system, in order to clarify the line of reasoning used in this MaxEP 
principle. The analogy below is very rough, and definitely not a proof for MaxEP. There 
are a lot of theoretical problems with it, so one should not take it too seriously.

The non-equilibrium MaxEP: Take a system consisting of three compartments with
two fluxes between them. Let us take the linear response regime, where these fluxes $F_{ij}$
have conductances $C_{ij} = c_{ij}^2$ relating them to the forces via $X_{ij} = F_{ij}/C_{ij}$. Suppose the conductance
$C_{23}$ is unknown. This means that also the steady state values (using (34, 35)) of $X_{ij}$ and
$F_{ij}$ are unknown. MaxEP claims that they can be derived by maximizing the partial
steady state EP (36): $\sigma_{23}^{*} = F_{23}^{*}(X^0 - F_{23}^{*}/C_{12})$.

The equilibrium MaxEnt: Consider a closed system (energetically coupled with an
environment), consisting of two closed boxes which are also energetically coupled. For
simplicity, the volumes and heat capacities of the two boxes are equal to unity. The
two boxes contain an ideal gas with particle numbers $N_1$ and $N_2$ at temperatures $T_1$
and $T_2$ respectively. Suppose that a priori only $T_1$ and the total number of particles
$N^0 = N_1 + N_2$ are known and constant. The other variables and parameters are derived
by MaxEnt.

The following table represents the analogy schematically: 
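MaxEP                                                | MaxEnt
-----------------------------------------------------+-----------------------------------------------------
fluxes: $F_{12}$, $F_{23}$                           | energies: $E_1$, $E_2$
conductances: $C_{12}$, $C_{23}$                     | temperatures: $T_1$, $T_2$
forces: $X_{12}$, $X_{23}$                           | particle numbers: $N_1$, $N_2$
$X_{ij} = F_{ij}/C_{ij}$                             | ideal gas approximation
steady state: $F_{12}^{*} = F_{23}^{*}$              | energy equality: $E_1^{*} = E_2^{*}$
non-equilibrium constraint:                          | particle conservation:
$X_{12} + X_{23} = X^0$ is constant                  | $N_1 + N_2 = N^0$ is constant
unknown: $F_{ij}$, $C_{23}$, $X_{ij}$                | unknown: $N_i$, $T_2$, $E_i$
MaxEP ⇒ $C_{23,\mathrm{Max}} = C_{12}$,              | MaxEnt ⇒ $T_{2,\mathrm{Max}} = T_1$,
$F_{ij,\mathrm{Max}}^{*} = C_{12} X^0 / 2$           | $E_{i,\mathrm{Max}}^{*} = N^0 T_1 / 2$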


Of course, one can always take a system with different conductances, so MaxEP is
not generally true. A similar possibility occurs in well known equilibrium statistical
physics: when the two boxes in the MaxEnt system are energetically isolated, it is also
not necessary that $T_2 = T_1$. As energetic coupling is a necessary condition for temperature
equilibration in the MaxEnt formulation, there should be an analogous necessary
condition in the non-equilibrium system in order that MaxEP is valid. Once one can
find this kind of 'coupling' in the non-equilibrium system, and once one can demonstrate
that the path information entropy is (perhaps under some further restrictions) related
to the partial EP corresponding to an unknown parameter, then one can give a
best guess for this parameter. In this way, perhaps the best guess for e.g. the atmospheric
heat conduction parameter in the Paltridge model is derived by maximizing the
atmospheric EP.

The above discussion might give a hint to explain why the experimental atmospheric 
heat transport is close to the MaxEP value. Dewar [2] correctly pointed out that the 
predictive success of MaxEnt hinges on having correctly identified the constraints. As 
the temperature equality in the two box system depends on the energetic coupling 
due to the absence of internal constraints (e.g. dividing isolating walls), the MaxEP 
heat transport value might perhaps also depend on some coupling due to the absence of
constraints (e.g. the conductances should be sufficiently variable). We end this appendix 
by repeating that the above ideas are still very speculative. 